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. ■ , SUMMARY 

Hit p«p«r give* *a aoeouat of soma design derisions mada during the development of the 
programming Modula. It — the essential charactcrUtlca of Its implementa- 

tion* on d» PDP-11 caaqraMr, in particular its rtm-tiinaedminlstnttion of processes and the 
medbaaism of rif allhn Tbs paper ends with some comments on tha suitability of the 
PDP-11 for this hlfh level multi p rogr am m l hg l a n guage. 
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INTRODUCTION 

In this report I shall try to do two things: to give an account of how and why certain design 
decisions were made in the development of the programming, language Modula, and to 
explain some details of its implementation on the PDP-1 1 computer. The reader is referred 
to the defining report of Modula 1 which also contains a brief overview of the language, and 

to the companion paper illustrating some, typical applications of Modula.* 

The entire project started out with the modest aim of gaining some experience and 
insight in the field of multiprogramming and device handling. Such insight, it was stipu- 
lated, could well lead to a set of practical rules or .guidelines fdr effective and reliable 
multiprogram system design, that could ultimately consolidate into a discipline. It soon 
became evident that the creation of such a set of rules amounted to nothing less than the 
design of a new ‘language’, i.e. a set of notations and structures in terms of which design 
would proceed. Some progranis were developed in this way and formulated in assembly 
code. This . approach made it possible to experiment with some proposed elements 
on which the design discipline was to evolve, but more significantly it also drove home a 
message that actually should have been known already: that the mere proposal of an 
abstract notations! scheme to design systems (which thereafter must be coded by hand in a 
low-level language) will neither lead’ to reliable products nor to a . significant saving in 
development cost. The necessity of a high-level language with a fully automated compila- 
tion process is unquestionable. The modest learning effort in multiprogramming thus 
slowly evolved into another exercise in language design and implementation. 

Our desire to take over many familiar elements from Pascal is understandable. After all, 
our aim was to concentrate on the novel aspects due to multiprogramming, and not on the 
design of an entirely new language s toto. Yet, the occasion was seized to change several 
details, and in particular to omit certain facilities that are either, complicated to implement, 
or not needed in the context of our original goal, or both. In retrospect, Modula turned out 
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to be a considerably smaller language than Pascal. Yet it provides ample room for experi- 
menting and exploring good techniques of programming, for committing blunders and 
recognizing that mSSfungramming is truly difficult even with a neat and systematic tool at 

ha ?h c development of the compiler was started when a sketchy definition of the language 
had been laid out. The compiler was written in Pascal as a one-pass cross-compiler running 
on the CDC 6400. This approach allowed a relatively quick adaptation to language specifi- 
cation changes, which were rather frequent for a considerable time. Yet the exercise proved 
once againthat, unless a very large part of the language is ‘right at the beginning, such a 
project has a small chance of survivial. It is particularly important to choose the nght 
moment when a language definition is to be stabilized, i.e when a report is to be released 
for programmersTand when a second effort is to be launched aiming at a well-engineered, 

^hif paper ^discusses a few selected topics only, in particular those that distinguish 
Modula from Pascal. These are the module structure and all facilities concerned with con- 
current processes. The discussion of aspects of implementation is restricted to the run-time 
organization; we refrain from commenting on compiler technology. 

MODULES 

The block structure of Algol and Pascal with its facility to declare local objects doesnot 
adequately cover the needs of systems programming. In particular, it does not allow to hide 
objects (or details about them), while they are still in existence. Objects cease to exist, as 
soon as control leaves the procedure (block) to which they are local. To some degree, the 
own variables of Algol 60 incorporate the desired information-hiding property. However, 
the own concept has well-known deficiencies, and a different solution has to be found. 

A very mudh more appropriate solution was offered by the class concept of Simula as 
modified by Brinch Hansen and Hoare.*" 8 The class definition defines a set of procedures 
(operators) and a set of variables to which only these procedures have access. The important 
aspect is that these variables continue to exist if control leaves any of these procedures. 

Whv then, did we not adopt the class structure in Modula ? The primary reason is that 
the unit of program which has to encapsulate information from its environment— now 
called a module— appears to be of relatively large size in systems programming, and that 
usually only one instance of such a module exists. This is in contrast to the original aim of 
the class concept, where there exist many such objects of identical structure. They form a 
class The class embodies the idea of an abstract data type to be declared in conjunction 
with its applicable operators.** 7 The module, however, rather pursue* the aim of an adequate 
facility to declare such entities as, for example, a scanner in a compiler, a disk store manager 
in an operating system or a communication line handler in a data station. Hence, the 
module has somewhat different objectives than the class and it would therefore be mis- 
leading to claim that the module concept replaces the class concept, although it can be 
shown that formally the former covers the latter. We shall show how a class definition can be 
represented by a module declaration, and then discuss the advantages and disadvantages 

of the two structures. 

Consider the following class definition 4 : 
type C « class 

vmr x, y: T; 
procedure p(U); 
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begin ... end; 
procedure entry q(V); 
begin ... end; 
begin S 
end 

This is expressed in terms of a module as follows: 

module M; 
define R, q, s; 

type R - record x, y: T end; 
procedure p(v«r r: R; U); 

begin ... end p; 
procedure q(var r: R; V); 

begin ... end q; 
procedure s(m r: R); 
begin S end s 
end M 

The advantage of the is evident, if we create several i n s t ances of the data structure, 
n eb f onai»tin g of the two component* x and y: 

var a, b: C 

Thereby the initialization statement S is implicitly invoked once for a and once for b. 
An invocation of procedure q, applied to a is expressed conveniently as 

a.q(v) 

where cl appears as a parameter in a distinguished position. In the case of the module 
structure, the two variables are declared as 

▼or a, b: R 

and the call as 

q(». ▼) 

Initialization must be stated explicitly as s(a). The principal advantage of the class 
notation appears, if there are several classes, perhaps with the same identifiers for some of 
their individual operators: their names are entirely local, and their identity is determined by 
the prefixed parameter, whose type uniquely specifies one class. 

It is not our intention to belittle these advantages, but the sophisticated combination of 
using a distinguished parameter position to identify the scope in which its operator is 
defined is of much lesser importance in the application area envisaged for modules. It seems 
only natural that the designer of a system chooses unique names for all operators exported 
from different (parallel, not nested) modules. The primary advantage is then one of con- 
ceptual simplification: the module has one and only one function, namely to establish a 
static scope of identifiers, whose a cross boundary visibility of identifiers is strictly under the 
programmer’s control. Export (and import) rules are not restricted to some operators 
(procedures), but apply identically to names of any kind of object, such as constants, types 
and variables. This generality of scope rules has in practice proved to be highly valuable. 

If only one instance of a class variable is to appear, then of course the module notation is 
equally simple, if not simpler. The declaration 
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module M; 

define q; ■ ' u ' ; 

rar x, y: T; ^' r 

procedure q(V); 

begin S • • 

replaces both the dais definition and its instantiation a : C, and the caU q(v) replaces a.q(v). 
In the case of multiple instances the module notation is more cumbersome, but also provides 
some additional possibilities, such as, for example, the introduction pf variables common to 
all inatanr* operators declared in the module heading and initialized in its body. 

Implementation of the module is, considering the additional possibilities, simpler than 
that of the After all, the only conceptual innovation of the module lies in delimiting the 
scope of identifiers. In the compiled code, there appears no trace of the module structure 
itself. Nevertheless, the additional sophistication of the compiler’s symbol table mechanism 
is rather considerable, and in any case greater than anticipated. It weighs particularly in the 
case of multipass compilers. Unfortunately, it appears to be almost impossible to avoid a 
multipass scheme. This is not only in view of Modula’s application to minicomputers with 
limited store size, but primarily because of the potential desirability of cross-references of 

objects defined in different modules. . 

There are two issues that gave rise to some discussions: the read-only nature of exported 
variables and the intransparency. of exported types. The latter stems from the desire to 
match the <-!«»« definition’s power of internal structure hiding. Hence, a programmer using 
an exported type (e.g. R in the above example) cannot use any knowledge about the details 
of R, he does not even know, for example whether R is a record or an array structure. 

The decision to declare every exported variable to be a read-only variable is based on the 
view that variables belonging'to a module actually should not be exported at all. Making them 
exportable as read-only variables, however, avoids the cumbersome declaration of function 
procedures that merely yield a variable’s value. 

GENERAL PROCESSES 

The most important and predominant decisions to be taken in the design of a multi- 
programming language implementation are those concerning storage allocation and pro- 
cessor management (scheduling). The guiding criteria in the decisions taken here were 
efficiency and simplicity of addressing (of variables) and of signalling operations. The 
latter are the only programmed operators that may cause the processor to switch from one 
process to another. Another objective was a minimal run-time support routine. 

Storage layout 

The language Modula projects a view of systems consisting of a fixed number of con- 
currently active processes. This concept eliminates the need for a dynamic storage manage- 
ment scheme among processes. But even for primitive systems, this view of an entirely 
static world is somewhat too simplicistic: even simple systems, upon ‘deadstart’, grow from 
an initial nucleus to their ‘operating’ complexity. Modula therefore supports the possibility 
of dynamic process generation (and allocation of their workspace). However, it does not 
support the notion of their dissolution. (In fact, a process may terminate, but this does 
not imply the recycling of its store.) Modula restricts the ability to generate new processes 
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to the main program put. This brings two advantages. First, all complications arising from 
processes having sons (which perhaps survive their ancestors) are avoided. Second after 
the mam program has reached a specific point (usually the end), it is guaranteed that no 
storage overflow can occur. This is crucial m all process control systems, where the main 
program part assumes the role of -an initialization phase. During this phase, storage is 
allocated sequentially as needed by newly generated processes. Each data segment contains 
^header containing a link to another process. All processes are thus linked in a single ring. 
This header, caUed the process descriptor, also represents the state of the process when it is 
not being served by the processor. 



F igure 1. Workspaces of processes linked in a ring 


Naturally, each process has its own stack (work space). Each stack is of fixed size This 

ModufZ h C C ° mpu ;f d bythe com P i l cr » Provided that no procedure is activated recursively 
Modula, however, does allow recursion. In this case, a compiler directive must indicate the 
maximum depth of the recursion. morale tne 

Addressing of variables 

A dominant problem is the addressing of variables, and the efficiency of the overall 
implementation crucially depends on the quality of its solution. Global variables (i e those 
declared in the mam program block) can be addressed directly, as their location is knntn 

IntSin Variabl r ^ 10 addre^“eS K Z lu Z 

rMDt a o ng m^ W Mch ntferaca occur frequently, it i, importtnt that their accm®, 
efficient. Therefore, the origin of the segment of the currently running process is held in a 
processor register (called CP for current process). g P dd m 3 

VanaWes local to a procedure must be addressed relative to that procedure’s data seg- 
ment (also called activation record). The common solution is to stack these segment, in J? 
workspace (stack of the process), and to access segmente d a d^TS 
onginatmg from a renter pointing to the topmost segment. This solution guafJSSs most 
procedural Inch and less efficient access to variables locaHo surrounding 

If ft is nS' A 8Uch . acce88€8 ^ frequent, this kind of sacrifice seems tolerable 

it is not, the mechanism of a display can be introduced which however inm« OOIT J 

reristerl UP ° n - Pr ^ C< i dUre ^ The chief drawback of the described’method is’ that a special 
eS qmrCd d ° ag mth mstruction » to change its value upon each procedure call ^d 

thiTif L DP ' U 8u , bro 1 utine 0811 mechanism uses a suck address register, and ft happens that 

is able m deuS,^ ^ T* " * ”?*** addre#8 for accessin & ^variables. The Lmpt 
t Jr . 0 dete 5 mm e »ts value at each point, and hence also the offset of any variable This 

SET * **** «««*» even with the use of recursion,“er^m^e and 

to «nerate U a £2! r 1?’ ^ 8 ^ U * pr ° gram faik > “ virtuSy impossible 
base addr^ ™thout the presence of a register contaiffing the 

base address of the most recently called procedure. If, due to these considerations 8 we 
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reserve not only a register CP (current prove*), bu, eUo . regie, er PP (procedure pointer), 
we obtain the following addresaing modes: 

1. Global variables: 

2. Variables local to a process: 

3. Variables local to procedure p: 

a. accessed within p: 

b. acces sed from procedures local to p: 

Operations on rignals ^ , utea . It ^ be under execution by the 

A process can be in any ,, , Or it can be waiting to be executed 

processor (OT a processor); ti» ‘ for a , ign al . to be sent; it is called 

as&MU state repiwntation by the de«*iptor is ««ce^ f • ’ Tmow' th« 

descriotor is linked into the queue of descriptors originating at variable s. It foUows war 
signals* are implemented very simply as variables with pointer values (see Figure ). 




Figure 2. Process descriptors representing r> 


Both in the ready end waiting etatce, the d*criptor represent. At £ate ofthe prow*. 
(Not«' PC iurovram counter) and PS (processor status) are not deposited in the descrip 
S 'tat iSK Z top of the (current) suck. The dccriptor contain., however 4e 
!Z!l’ poinSTsP.) The operation, of ending and waiting for a stgnal a are now quite 
straightforward. 

^ ThlpC and PS are dumped on the stack (e.g. by a trap instruction). Thai SF 'is stored 
in *e descriptor, the statTfield is set to 0, 

in the queue s is delinked, and CP is made equal to s. Finally, SP.PCand PS are 
from the delinked process segment, the latter two by an RTI instruction. 

™ pc w ps are dumped on the stack, and SP is stored in the descriptor. The 
statoTfidd obtains the positive value k (denoting waiting), and the last field is used to 
insert the descriptor in the queue starting from s. Insertion has to occur ata P lac ® d ^" 
mined bv the waiting rank k After this operation, any ready process is searched m the ring. 
The ring guarantees* a fair scheduling strategy, because each process gets its torn. In par- 
ticular,^ one entering the waiting state automatically becomes the last one to be resumed. 
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In principle one could replace the ring by a queue of ready processes. This solution is 
better, if the number of processes in a system is large (and most of them are waiting most 
of the time). But, otherwise, the overhead in delinking and reinserting descriptors in queues 
is a significant disadvantage. 

Avaited(s). 

This is a simple test for the queue s to be empty, i.e. a test for the address value ml. 
Interface modules 

The definition of an interface module specifies that at any moment at most one process 
may be executing some procedure defined in the module. 'Executing' here means ‘actively 
executing* , for the definition allows an exception: while a process is actively executing some 
interface procedure, other processes may be waiting, i.e. 'executing' a wait statement 
within the module. Hence, wait statements represent some sing ular points wi thin the 
module that are exempted from the strict mutal exclusion principle. 

The reasoning for this rule, according to Hoare,* goes as follows: If several processes 
simultaneously access common variables, the rules — in particular verification rules — of 
ordinary programming no longer apply. If accesses to common variables are restricted to 
specific areas — critical section, monitors, interface modules — and if mutal exclus ion of 
simultaneous entry to such sections is guaranteed, then we can again deal with the ™mmnn 
rules and laws of uniprogramming. In general, an invariant condition is established for an 
interface module that holds on entry, exit and for each wait statement. Let us denote this 
condition with I. If a communication need arises between two processes PI and P2, then 
this is because a certain other supplementary condition B has arisen in, say, PI that needs 
to be communicated to P2; a signal is sent from PI to P2. Hence we can write 

PI : I and B (send(s)} I 

The signal s may reactivate another process that is waiting for the signal We must 
insist that the receiving process P2 can rely on B to hold when it resumes, for this is essenti- 
ally the message carried by the signal. Hence the verification condition is 

P2: I (wait(s)} I and B 

Since the wait and signal statements occur in an interface module, we insist P2 must 
continue, while PI is delayed until P2 either leaves the module or 'drops into’ another 
wait statement. Hoare insists that at thiB point PI is immediately resumed, because it 
seems intuitively right that an interface module (being a ‘critical region’) should be 
‘occupied’ as rarely as possible (see Figure 3). (Note that the post-condition of the send 
statement does not include B. This is because we wish to allow P2 to invalidate B after it has 
resumed upon receiving the signal.) 

In our implementation, we have not obeyed this advice. Instead, we have exempted send 
statements from the mutal exclusion rule like wait statements. This solution has several 
advantages, but its consequences have not been fully explored. We merely wish to point out 
that no difference arises between our treatment of Bend statements and that of Hoare, if a 
send statement is the last statement of an interface procedure. And thin « the case in most 
examples that are commonly cited. 

The advantage gained by exempting send statements from the mutual exclusion condi- 
tion is that no mechanism for generating the signal s' (see Figure 3) is needed, and that no 
implicit processor switching occurs. A processor switches from one process to another only 
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if exe cutin g an explicit wait or send statement. In fact, this is not only a definite implementa- 
tion Minp lifiratinn, but also an easily comprehensible and helpful language rule. . 

The to assure mutual exclusion is now quite simple, and even trivial if 

Modula is implemented on a single-processor machine: no such mechanism is needed at all. 
If switches are restricted to wait and send statements, all processes except the one currently 
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, , . Figure 3. Processes with signal exchange 

being served; by the single processor are positioned at either a wait or a send statement. 
Since exactly these statements are exempted from mutual exclusion, no violation of the 
(relaxed) mutual exclusion rule can occur. (We have so far left out interrupts from our 
considerations on' purpose.) v 

In principle, the interface module is therefore identical to a regular module (if imple- 
mented on a single-processor machine). It was originally introduced by Brinch Hansen and 
Hoare with the intent that access to variables shared by several processes would be restricted 
to statements- within the module. A compiler may easily check this rule. Again, we have not 
adopted it* because — as shown by the Spacerace program* — there are legitimate applications 
where such a language rule would be unnecessarily restrictive. In most cases it is not, and 
therefore the rule that accesses to shared variables be collected in an interface module is 
highly recommended as a programming principle (rather than a strictly enforced law). 


DEVICE PROCESSES 

An important goal in the development of Modula was to facilitate the effective programming 
of peripheral devices. Such devices usually operate concurrently and therefore not only 
require for genuine multiprogramming but often also constitute the source of real- 

time programming problems. Processor and devices communicate via signals, e.g. starting 
pulses to . initiate a device operation, and interrupts to signal the completion of a device 
operation. The interrupt now appears as an important hardware concept that must be dealt 
with, but cannot be directly expressed in a high-level language based on the notion of 
coherent sequential processes. Instead of trying to ‘make available’ the interrupt in the 
undisguised form of some sort of jumps, actions executed by peripheral devices should 
be described by an explicit device statement (syntactically indistinguishable from ordinary 
statements). The compiler, knowing that such a statement is performed by a device, 
compiles code that initiates the device operation, sets up the interrupt return address and 
releases the central processor. The address is such that an interrupt causes the process to 
resume with the statement following the device statement. Sandmayr 8 has adopted this 
strategy and provides two kinds of device statements, reflecting the simplicity of device 
handling available on his computer (HP 2115). We deviate from this scheme only slightly. 
The device statement does not include device initiation, which instead is expressed by 
separate preceding commands. Hence, it suffices to introduce a single parameterless device 
statement for all purposes. We call it doio. 
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So far, processes could assume two possible states: ready and waiting (for a signal). 
Now a third state must be added: waiting for an interrupt (device signal). Such a simple 
approach should however be rejected, because the operation of processor switching due to 
an interrupt should be kept minimal, i.e. should not be augmented by additional overhead 
incurred merely because of an enforced mechanism of process administration. Interrupt 
handling in Modula should be precisely as efficient as if programmed in machine code. 
Efficiency can be improved by the following consideration: usually a whole sequence of 
device operations acknowledged by interrupts follow in short succession (e.g. reading all 
characters in a line). Hence, a process description may be left in the special third state u nti l 
the entire sequence of interrupts has been received. This idea led to the device-control 
statement in Reference 8, whose component statement S is executed without leaving the 
third state. 

under dev control S end 

S may contain loops, but is subjected to some obvious restrictions, such as exclusion of wait 
and send statements. The process state description then changes from ready to under 
interrupt control upon entry to the device control statement, and back upon exit. (The 
indicated device dev determines the interrupt location to be used wi thin S.) 

The solution adopted in Modula goes even one step further in this direction: each 
statement S to be executed ‘under device control’ must be declared as a process of its own. 
Of course, send and wait statements must then be admitted. A compiler must handle them 
in a different, special way that is designed to achieve the desired efficiency. Such processes 
are then called device processes. It follows that each such process is attached to a single 
device (whose associated interrupt location (or interrupt identification) is specified in its 
heading). This concept reflects the customary practice of associating a single driver with ea ch 
device. The driver or device process corresponds to the familiar concept of an interrupt 
routine. The driver is enclosed in an interface module (called a device module) together 
with all routines that communicate with the device and all details that are pertinent to the 
device. The module then serves to hide these details from the remainder of the program. 

Process representation 

The following details apply to the chosen strategy of a Modula implementation. 

Device processes are identified by a descriptor similar to those of regular processes. 

However, they are not linked into the ring. Their three states are described in Figures 4 
and 5. . ■ 

This solution has the additional advantage that the number of registers to be saved upon 
an interrupt can be kept fairly small. The compiler may be advised, for example, to utilize 
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Figure 4. Device process in running state 
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only two or three work registers in device processes. Moreover, switching of regular pro- 
cesses may be performed without any work register saving and restoring whatsoever. 




zfc 

interrupt 

vector 


Figure 5. Device process m ‘waiting for signaT and 'waiting for interrupt f states 


Mutual exclusion, signalling, and process priorities 

It is customary that during the execution of an interrupt routine further interrupts are 
withheld (delayed). An obvious approach to implement mutual exclusion in connection 
with device processes is therefore simply to shut off interrupts whenever a piece of program 
within a device module is executed. (Remember that a device module is automatically assumed 
to be an interface module.) Does this extremely simple and efficient solution suffice ? 

It does, provided the processor is returned to the interrupted process whenever the 
device process decides to release it. In principle, it must be released in three circumstances 
only: upon executing a doio, a wait or a send statement. If the latter two are implemented 
accordingly — and they can be handled differently from waits and sends in regular pro- 
cesses — the solution is safe. 

Many computers, including the PDP-11, feature an interrupt priority system. Each 
device is given (usually by the hardware) a fixed priority level. An interrupt from a device 
at level i then shuts out interrupts at levels j < i (i.e. not necessarily all interrupts), until the 
program ♦«!»« appropriate action. This notion is rather important in real-time programming, 
and must not be hidden by a high-level language. 

The fact that a device has a certain interrupt priority i in effect means that its associated 
process has a priority i for obtaining the processor. Such a priority is therefore indicated in 
Modula in the heading of each device module. It signifies (in the PDP-11 implementation) 
that all program pieces within the device module are to be executed with that specified 
interrupt masking level. (This is another reason why device modules cannot be nested.) 

The concept of a process priority raises another aspect about the rules of signalling. It 
was argued above that a process sending a signal must release the processor and pass it on 
to the receiving process. Should this also be true, if the sender is a device process, i.e. one 
with a hi gher priority than the receiver ? This would at least be counter-intuitive. We have 
therefore adopted the rule that a send statement executed by a driver does not release the 
processor, but merely marks the receiver as ready. The above quoted axioms about wait 
and send statements therefore hold only provided the device process does not itself invalidate 
the condition signalled. We may formulate this restriction by the following general rule: a 
(device) process must never invalidate a condition signalled to a process of lower priority. 
This, however, does not appear as a handicap, if a driver operates in a simple producer/ 
consumer constellation. And drivers usually satisfy this constraint. On the other hand, the 
advantage gained by this solution is significant, because the number of process switchings 
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is thereby reduced, and the driver incun no delay in reactivating its device. Perhaps the 
morale of this story is that in real-time multiprogramming the same verification arinm« may 
be applied, provided certain additional constraints or rules of progr amming discipline are 
observed. 

The different handling of regular and of device processes makes it necessary for a sender 
to be able to determine the category of the receiver. This requires a test of a descriptor bit. 
This test could be saved, if the compiler could be enabled to perform it. This would, how- 
ever, require the introduction of a second kind of signals. We would then Hi«»ingn; f b 
between signal variables upon which only regular processes may wait, and those on which 
device processes wait. Many programming applications have led to the assumption that such 
a restriction is in practice observed anyway. It even appears advisable to bind every device 
signal variable to one specific process. 


THE NUCLEUS 

We call that part of a Modula program the Nucleus which is identical for every program 
and resides permanently in the computer’s main store. It is that part which embodies the 
mechanism for process administration including the routines for wait and send statements. 

It must be the goal of any language (and of its implementation) for general systems 
programming to keep its nucleus very small. The structure and mechanism of the nucleus 
determine the overall concept of the language, and the bigger it is, the more does it restrict 
the flexibility of programs written in the language. A large nucleus is also an «nmi«»air.Ki, 
symptom for built-in overhead. Nucleus routines should be minimal both in number and in 
size. 

Modula has been designed with this postulate in mind. Its nucleus consists of routines for 
the starting of processes, and for the wait and send statements. Moreover, it contains some 
code for system initialization. The sizes of these routines are 

Routine No. of instr. No. of 'words 


start 

18 

24 

wait 

27(21) 

30 

•end 

16(9) 

20 

initial 

13 

24 


Hence the Nucleus has a total size of 98 words. (We do not consider the process des- 
criptors as belonging to the Nucleus, although they are accessible to Nucleus routines only. 
Each descriptor occupies four words.) The lengths of the calling sequences of these routines 


Routine 

No. of instr. 

No. of words 

start 

4 

7 

wait 

3 

5 

•end 

4 

5 


The length of die calling sequence for the send routine contains a test for the nl gnal value 
‘not awaited’. This avoids a call instruction in case the signal is not expected by any process. 
It appears essential that signalling is very cheap, if it has no effect. 
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The code representing the send, wait and doio statements within device processes is 
expanded inline, and hence does not require any nucleus code. Its length is 

Macro No. of instr. No. of words 


•tart 

5+n 

12+n 

wait 

8 +n 

16+n 

•end 

5 

6 

doio 

8 + 2n 

14+2n 


(n denotes the number of work registers available to device processes. They are saved and restored upon 
interrupt.) 

Considering the brevity of this Nucleus, one is tempted to regard the Nucleus as a fix 
of the given hardware to accommodate the desired multiprogramming concept, rather than 
as a resident software support. It might very well be implemented by suitable micro- 
programs (if that facility is available). 

MISCELLANEOUS DESIGN CONSIDERATIONS 

In this section, several issues will be discussed briefly that arose during the design of the 
language Module and its first implementation. The list of these issues is neither exhaustive 
nor ordered according to their length of debate. 

Statement struc t u res 

As far as the syntax is concerned, Modula deviates from its ancestor Pascal in an essential 
respect: statement structures consistently follow a single guiding rule. Every structured 
statement begins with a keyword (uniquely identifying the kind of structure) and ends with 
a minting symbol. The effect is shown for the while statement: 

Pascal: Modula: 

while B do S while B do S end 

w hile B do while B do SI; S2 end 

begin SI ; S2 end 

The decisive consideration was that, if a statement is to be inserted in a program, no 
other symbols (such as a begin) would have to be inserted (deleted) in addition to the 
statement itself. The rule that every structured statement had to end with a closing symbol 
(usually end) had, however, the following consequence: Consider the frequent case of a 
cas ca d-d conditional such as (written in Pascal or Algol 60) 

if B1 then SI else 
if B2 then S2 else 

if Bn then Sn else S 

This now has to be expressed by the awkward construct 

if B1 then SI else 
if B2 then S2 else 

if Bn then Sn else S 
end 

end 

end 
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It is quite natural in this case to introduce a construct that avoids the nesting of structures 
and eliminates the sequence of end symbols: 

if B1 then SI 
elaif B2 then S2 

elsif Bn then Sn 

else S 

end 

Note that this represents a genuine, clean solution in contrast to the fixup rule that 
‘multiple ends’ may be merged into a single end symbol. 

Jumps and loop structures 

The decision to omit jumps (goto statements) from Modula was a deliberate one; its aim 
is to explore the practicality of goto-free programming. It was taken without any claim to be 
the last word (and acknowledging the plain fact that many programmers will be, happier and 
more at ease when a jump is available to them). It was evident, however, that at least one 
repetitive structure had to be included (in return), namely a loop statement allowing for one 
or several exits anywhere within the sequence of repeated component statements. The 
chosen form of such a loop statement appears as somewhat baroque. In its favour it can be 
stated that it allows the obvious representation of virtually all loop constructs (including 
exit actions specific to each exit point), and that its implementation is quite straightforward. 
It has practically eliminated the need for labels and jumps. A very important use of the 
loop statement is the expression of a non-terminating repetition. 


The data type bits 

A conceptually appealing and efficient representation of bitstrings has been included in 
Pascal: the set structure. It is absent from Modula. This calls for an explanation. Reserva- 
tions against adopting the set structure in Modula arose from the short wordlength in mini- 
computers, which is a natural limit for the cardinality of sets,- if an efficient implementation 
is desired. A representation using multiple words for set variables was finally rejected,' not 
so much on grounds of principle, but because of the desire to keep language size and com- 
piler development effort in proportion to the available resources. 

An analysis of the use of set structures in system programming indicated that access to 
individual bits is primarily needed for the following two purposes: 

1. Operations on device registers, whose stnicture is defined by. the hardware. 

2. Boolean arrays,, such as storage reservation tables. 

This led to the adoption of the single standard data type . 

bits *■ array 0 : 15 of Boolean : =v ■ 

instead of a general set structure. Thereby the introduction of- yet another concept is 
avoided. All operations applicable to Boolean arrays apply- to bits. A compiler can easily 
recognize them as a case where dense data packing and use of particular instructions are 
applicable. As an extension, Modula admits the Boolean operators not only to components 
of bit variables, but to bit variables themselves, implying that they are executed (pairwise) 
on all components. This makes the direct use of the PDF- IT bit-instructions BIC, BIS, 
BIT and XOR possible. ‘ * 
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The acceptance of a standard array type also calls for a denotation of array constants. 
We have restricted such an array constructor facility to the standard type bits, and instead of 
the obvious notation 

[®o* ®i» •••> ®1#1 

where ci denotes the value of component i, we have retained the notation of sets, where 

fa,. Xu ...| xj 

denotes the indices Xi of all components having the value true. Moreover, this constructor 
is restricted: the must be constants. Therefore a compiler can always perform the test 
15 for all elements, and it can directly construct an appropriate bitstring. 

Integer division 

The axiom of division 

x/(-y) - (-x)/y - -(x/y) 

specifies symmetry relative to 0 and is, to the author’s knowledge, observed by all com- 
puters that feature a division instruction. It now happens that most divisions in system 
programming have a small constant as divisor; most frequent is a small power of 2. In 
these cases, a compiler should be able to employ a shift instruction for reasons of efficiency. 
But herewith arises a problem: if a computer UBes the two’s complement representation for 
negative numbers, then division of negative dividends is not symmetric with respect to 
zero. There are the following possibilities for the language designer: 

1. Test for the sign of the operand, i.e. compile code which includes the test(s) and 
corrects the quotient by adding or subtracting 1, if necessary. 

2. In order to mal«» the use of shift operations possible, eliminate the inconsistency by 
forbidding negative operands. 

3. Introduce two distinct division operators in your language. 

Solution 1 is unacceptable, because it more than offsets the efficiency gained by the use of 
a shift instruction. Solution 2 appears as unpleasantly restrictive. Hence, there remains 
solution 3. In Modula, regular division is denoted by / and the division that can be 
implemented by shifts by div. Let x div y - q be the quotient and x mod y « r be the 
remainder. Then this division satisfies the following axiom for all integers x: 

x-»q*y+r and 0<r<y 

Note that the divisor must be strictly positive. Efficient compilation is now possible in 
those ca w where it is most needed: division by small constants. Then the compiler can 
check y>0, it can check whether the use of shift is possible, and otherwise use a division 
instruction with an appropriate corrector. This is necessary for both the div and mod 
operations. 

An important application of the mod instruction is the stepping of an index in a cyclic 
buffer with N elements. If the origin index is 0, then 

i : — (i+1) mod N 
i : — (i — 1) mod N 

represent the Upping up and down of the index respectively. A single masking instruction 
can be used, if N is a power of 2. 
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A dock, a tick or a pause? 

Most modern computers feature a device that interrupts the processor in fixed time 
intervals, usually that of the line frequency. In the usual jargon, this device is called a 
clock. How should the clock be represented in a high-level language ? 

The most obvious way in Modula is to introduce a device module containing a driver 
process that periodically waits for the clock interrupt (denoted by doio) and then sends 
a signal. For example; 

loop doio; send(tick) end 

One is very much tempted to declare such a process and the signal variable tick as 
Standard components of Modula that are implicitly available. We have not done so, because 
frequently a dock process of a more complicated form, e.g. 

loop doio; inc(time); 

while awaited(tick) do send(tick) 

end 

end 

might be preferred (or be necessary). By not declaring the tick signal to be a standard 
variable, the Modula programmer has the freedom (and the burden) to define his own 
dock process explicitly. 

Remark concerning the PDP-11 implementation: 

The restriction that signals must not be exchanged between device processes usually necessitates a 

second, regular dock process. The standardization of a min i m al clock driver as a device process would 

therefore be even more justified. However, this consideration is particular to our PDP-11 implementa- 
tion. 

We have refrained from introducing a standard signal tick, also because another solution 
might appear as more convenient for the programmer, namely the use of a standard pro- 
cedure patue(n) that delays the calling process by n tick intervals. This procedure incarnates 
the genuine purpose of using a dock, namely the delaying of a process by a given time. 
The principal advantage of the pause procedure over the tick signal as a predefined object 
is found in the reduction of the number of process switchings, if properly incorporated in 
the Nudeus. 

When considering implementation techniques, an immediate temptation lies in intro- 
ducing a third process state in addition to ready and waiting (we may call it delayed). 
The parameter n could be stored in a fifth fidd of the process record, and upon each dock 
interrupt, its value would be. decremented for all ddayed processes. A simpler solution lies 
in introducing a hidden signal variable (call it tick), and in interpreting each pause(n) as a 
wait(tick, n) statement. The dock interrupt routine then merdy needs to decrease the 
delay ranks in all linked records, and to reactivate the first process, when its rank has 
reached 0. (The wait statement performs the insertion at the appropriate place.) 

COMMENTS ON THE PDP-11 

Every Modula program constitutes an abstract machine that (apart from some isolated 
parts called device modules) is defined entirdy by the definition of the language Modula. 
Hence, what we call ‘language’ is in effect an infinite collection of abstract machines or, 
more appropriately, a tool kit out of which arbitrary abstract machines can be engineered. 
We wish to make available a universal interpreter of these machines, and therefore build a 
compiler that enables an existing computer, in our case a PDP-11, to function as a universal 

6 
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interpreter. The question then arises, to what degree is the given computer suitable for 
this purpose, or how natural is the structure of the real machine for mirroring that of 
the abstract machine. 

We have chosen the tool kit Modula to reflect the dominant structural aspects of present 
computers quite truthfully, and to hide their peculiarities to a large degree, thereby creating 
a more systematic tool that is easier to comprehend and manage. Our subsequent con- 
siderations therefore focus on particular characteristics of the PDP-11 rather than its 
overall structure. 

Some important points have already been mentioned by Bron,* notably the well-designed 
subroutine call mechanism using a stack and the old-fashioned concept of the condition 
codes. We give some further comments on these issues, and add a few items to the ‘negative 
list’ in the hope that the criticism be understood as constructive and that it may help 
designers of hardware to recognize the issues. 

The virtues of the subroutine call instruction J SR are that it performs what is needed 
(stacking the return address) and (almost) no more. JSR can be used to stack any register R 
and to place the return address into R. This is of some advantage when constant para- 
meters are placed immediately following the jump instruction. This technique stems from 
practices of assembly coding and cannot be used in general. Our compiler does therefore 
not fully utilize the capabilities of die JSR instruction, since it always generates JSR PC,x. 
One detrimental effect in designing the JSR instruction more complicated than needed, 
together with the desire for a matching return instruction, is the MARK instruction. Used 
in conjunction with RTS and added in later PDP-11 models as an afterthought, it is a 
beautiful example of poor ‘fixup engineering’. 

If a procedure p in a device module has to be called, then the current priority level of the 
processor must be saved before the jump is executed. The call is implemented by the two 
instructions 

MOV PS,-(SP) ; stack PS 

JSR PC,p ; stack PC and jump 

It appears that a single instruction to obtain this effect is quite obviously missing, because 
the complementary instruction RTI (which unstacks PC and PS) is present. (The 4 (f) trap 
instructions appear identical to the proposed new instruction with the restriction that they 
jump to a fixed address.) 

The concept of a condition code register serves to record certain frequently used predicates 
of every computed result. As such, we have no objections to it. Objectionable, however, is 
the lack of a suitable instruction that readily transfers these predicate values into data 
registers, preferably in the form of a Boolean value. Such a simple instruction would 
facilitate the compilation of Boolean expressions very considerably. 

Apart from this, it appears that the PDP-11 structure is almost ideally suited for the 
efficient realization of the Modula concept of multiprogramming. This is mainly due to the 
ideas of a stack and SP-register and the way interrupts are implemented. 

A few more remarks are in order concerning the operating of devices. The rule that each 
device has its own interrupt address is eminently sound and crucial. Also, the notion of 
priority levels and its realization are most appropriate. The desire to impose a systematic 
approach to device command and status representation is laudable. Nevertheless, in experi- 
menting with various devices. Knowledge in operating one device was rarely applicable to 
handling another. In particular, are found the following rules crucial, and are postulated as 
axioms for a sound scheme of device handling: 
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1 Each device command (operation) ia known to produce either no interrupt or at least 
one interrupt. 

2. Upon interruption, the interrogatable device status indicates whether or not another 
interrupt will follow. 

3. Each interrupt is due to a distinct, preceding device command. (In other words: no 
interrupt arrives unexpectedly.) 

Each device has one unique interrupt location. (A multiplexor handling several 
terminals, for example, may be considered as a single device.) 

Unfortunately, the PDP-11 and/or its devices disregard these rules quite frequently and 
thereby often relegate the art of device programming into the domain of heuristic experi- 
mentation. For example, we have a printer that requires a dc3 character to start its print- 
chain drive. Sending a dc3 is acknowledged by an immediate interrupt (what for ?), and a 
S econd interrupt follows when the chain has reached its regular speed, but only if the chain 
had not been running in the first place. There is no way to interrogate the chain drive 
status. Most output devices violate Rule 3: they interrupt the processor when the interrupt 
is initially enabled. Also, the card reader sends an unexpected interrupt when it is switched 
on. These have caused us to adopt the strict rule to switch off the interrupt enable 
bit after every interrupt (doio). This simply should not be necessary. 

Because the PDP-11 attempts to systematize and unify device handling more than any 
other ™mpnt *r known to the author, deviations from the basic scheme are particularly 
visible and annoying. One such device is the GT-40 display unit. Its interrupts can trap to 
different locations (Rule 4), whether an interrupt is sent after device completion cannot be 
determined by the main processor, and interruptability cannot be enabled or disabled by 
setting or resetting a bit in a device register. The display unit’s and the main processor’s 
designs not only seem to originate from different engineers, but also from different eras of 
programming and different epochs of computer architecture. 

IN RETROSPECT 

I have tried to report on project Modula by not merely offering the final result (which may 
be not so final after all), but by describing the deliberations and considerations that led to it. 
In retrospect I recognize how ambitious this aim was. The design of a progra mming 
langua ge and its compiler is a very large task in general. But if the subject is new, largely 
lacking a solid scientific basis, and influenced by many rapid developments in technology, 
such a project is a risky foray with many surprises. Hence, its evolution cannot progress in 
a straight line. It is characterized by many short paths of exploration followed by retreats. 
In writing a report such as this, one is not only tempted but forced to ignore many of these 
’incidents’, although some of them still leave their traces. I should therefore warn the 
reader that the development has not been as straightforward as it may appear from this 
account, and that many design decisions have been preceded by long arguments which helped 
to clarify the motivations underlying the various alternatives. 

The most important lesson perhaps is that conscious design decisions based on conscious 
motivations are very strongly influenced by the intended field of application. Being unable 
to consciously delimit this intended space of application is possibly the dominant reason 
for the uncontrollable growth of language diversity and complexity. 1 * If a language proves 
to be only marginally suitable for some application that was obviously not envisaged by its 
originator, we should muster the coinage to build a new, truly adequate tool, instead of just 
grafting a fix onto the existing one. 
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